Model Reduction and Neural Networks for Parametric PDEs
We develop a general framework for data-driven approximation of input-output maps between infinite-dimensional spaces. The proposed approach is motivated by the recent successes of neural networks and deep learning, in combination with ideas from model reduction. This combination results in a neural network approximation which, in principle, is defined on infinite-dimensional spaces and, in practice, is robust to the dimension of finite-dimensional approximations of these spaces required for computation. For a class of input-output maps, and suitably chosen probability measures on the inputs, we prove convergence of the proposed approximation methodology. Numerically, we demonstrate the effectiveness of the method on a class of parametric elliptic PDE problems, showing convergence and robustness of the approximation scheme with respect to the size of the discretization, and compare our method with existing algorithms from the literature.
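As a rough illustration of the approach described above (not the authors' implementation), the sketch below projects discretized input and output functions onto a few PCA modes and learns the map between the reduced coordinates with a small neural network; the data, dimensions, and layer sizes are placeholder assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

# Placeholder data standing in for discretized input/output functions
# (e.g. PDE coefficients and corresponding solutions sampled on a grid).
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 256))                       # 500 inputs on 256 grid points
U = np.tanh(A @ rng.standard_normal((256, 256)) * 0.05)   # synthetic "solutions"

# Model-reduction step: project both spaces onto a few PCA modes.
pca_in, pca_out = PCA(n_components=20), PCA(n_components=20)
a_red = pca_in.fit_transform(A)
u_red = pca_out.fit_transform(U)

# Learn the map between the reduced coordinates with a small network.
net = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=2000)
net.fit(a_red, u_red)

# New input: project, map through the network, lift back to the full grid.
a_new = rng.standard_normal((1, 256))
u_pred = pca_out.inverse_transform(net.predict(pca_in.transform(a_new)))
```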
Analysis Of Momentum Methods
Gradient descent-based optimization methods underpin the parameter training that produces the impressive results now seen when testing neural networks. Introducing stochasticity is key to their success in practical problems, and there is some understanding of the role of stochastic gradient descent in this context. Momentum modifications of gradient descent, such as Polyak's Heavy Ball method (HB) and Nesterov's method of accelerated gradients (NAG), are widely adopted. In this work, our focus is on understanding the role of momentum in the training of neural networks, concentrating on the common situation in which the momentum contribution is fixed at each step of the algorithm; to expose the ideas simply we work in the deterministic setting. We show that, contrary to popular belief, standard implementations of fixed momentum methods do no more than act to rescale the learning rate. We achieve this by showing that the momentum method converges to a gradient flow, with a momentum-dependent time-rescaling, using the method of modified equations from numerical analysis. Further, we show that the momentum method admits an exponentially attractive invariant manifold on which the dynamics reduce to a gradient flow with respect to a modified loss function, equal to the original one plus a small perturbation.
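The rescaling statement can be made concrete with a standard formulation of the Heavy Ball iteration (generic notation, not necessarily that of the paper):

```latex
% Heavy Ball with learning rate \alpha and fixed momentum \beta
x_{k+1} = x_k - \alpha \nabla f(x_k) + \beta\,(x_k - x_{k-1}), \qquad 0 \le \beta < 1,
% which, in the small-step limit, tracks the time-rescaled gradient flow
\dot{x}(t) = -\nabla f\bigl(x(t)\bigr), \qquad t \approx \frac{\alpha}{1-\beta}\, k,
% i.e. fixed momentum behaves like plain gradient descent with
% effective learning rate \alpha/(1-\beta).
```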
Conditional Sampling With Monotone GANs
We present a new approach for sampling conditional probability measures, enabling consistent uncertainty quantification in supervised learning tasks. We construct a mapping that transforms a reference measure to the measure of the output conditioned on new inputs. The mapping is trained via a modification of generative adversarial networks (GANs), called monotone GANs, that imposes monotonicity and a block triangular structure. We present theoretical guarantees for the consistency of our proposed method, as well as numerical experiments demonstrating the ability of our method to accurately sample conditional measures in applications ranging from inverse problems to image in-painting.
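As a hypothetical sketch of the kind of block-triangular, monotone generator involved (this is not the authors' architecture, loss, or code), one might write, in PyTorch:

```python
import torch
import torch.nn as nn

class BlockTriangularGenerator(nn.Module):
    """Map (x, z) -> y; the full transport T(x, z) = (x, y) is block triangular
    because the x-component is passed through unchanged."""
    def __init__(self, x_dim, z_dim, y_dim, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, y_dim),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=-1))

def monotonicity_penalty(gen, x, z, z_prime):
    """Encourage <T(x, z) - T(x, z'), z - z'> >= 0, a relaxed form of
    monotonicity in the reference variable z (requires y_dim == z_dim)."""
    diff = (gen(x, z) - gen(x, z_prime)) * (z - z_prime)
    return torch.relu(-diff.sum(dim=-1)).mean()

# The adversarial training loop is omitted; the penalty above would be added
# to the usual GAN generator loss. After training, conditional samples at a
# fixed input x_star are obtained by resampling the reference noise z:
gen = BlockTriangularGenerator(x_dim=3, z_dim=2, y_dim=2)
x_star = torch.zeros(1000, 3)          # repeat the conditioning input
z = torch.randn(1000, 2)               # reference samples
y_samples = gen(x_star, z)             # approximate draws from y | x = x_star
```

After adversarial training against samples of the joint distribution of inputs and outputs, conditional draws at a new input are obtained simply by fixing that input and resampling the reference noise, as in the last lines above.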
Ensemble Kalman Inversion: A Derivative-Free Technique For Machine Learning Tasks
The standard probabilistic perspective on machine learning gives rise to empirical risk-minimization tasks that are frequently solved by stochastic gradient descent (SGD) and variants thereof. We present a formulation of these tasks as classical inverse or filtering problems and, furthermore, we propose an efficient, gradient-free algorithm for finding a solution to these problems using ensemble Kalman inversion (EKI). The method is inherently parallelizable and is applicable to problems with non-differentiable loss functions, for which back-propagation is not possible. Applications of our approach include offline and online supervised learning with deep neural networks, as well as graph-based semi-supervised learning. The essence of the EKI procedure is an ensemble based approximate gradient descent in which derivatives are replaced by differences from within the ensemble. We suggest several modifications to the basic method, derived from empirically successful heuristics developed in the context of SGD. Numerical results demonstrate wide applicability and robustness of the proposed algorithm.
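A minimal NumPy sketch of one such ensemble update, in the standard perturbed-observation form, is given below; the function name, arguments, and shapes are illustrative assumptions rather than the paper's code.

```python
import numpy as np

def eki_step(U, G, y, Gamma, rng):
    """One ensemble Kalman inversion update (perturbed-observation variant).

    U     : (J, d) ensemble of parameter vectors
    G     : forward map, G(u) -> (m,) predictions (no derivatives needed)
    y     : (m,) observed data (e.g. training labels)
    Gamma : (m, m) observation-noise covariance
    """
    J, m = U.shape[0], len(y)
    W = np.array([G(u) for u in U])               # (J, m) ensemble predictions
    u_bar, w_bar = U.mean(axis=0), W.mean(axis=0)
    Cuw = (U - u_bar).T @ (W - w_bar) / J         # cross-covariance, (d, m)
    Cww = (W - w_bar).T @ (W - w_bar) / J         # prediction covariance, (m, m)
    Y = y + rng.multivariate_normal(np.zeros(m), Gamma, size=J)  # perturbed obs
    K = Cuw @ np.linalg.solve(Cww + Gamma, np.eye(m))            # Kalman-type gain
    return U + (Y - W) @ K.T                      # derivative-free, gradient-like step
```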
Learning Homogenization for Elliptic Operators
Multiscale partial differential equations (PDEs) arise in various applications, and several schemes have been developed to solve them efficiently. Homogenization theory is a powerful methodology that eliminates the small-scale dependence, resulting in simplified equations that are computationally tractable. In the field of continuum mechanics, homogenization is crucial for deriving constitutive laws that incorporate microscale physics in order to formulate balance laws for the macroscopic quantities of interest. However, obtaining homogenized constitutive laws is often challenging, as they do not in general have an analytic form and can exhibit phenomena not present on the microscale. In response, data-driven learning of the constitutive law has been proposed as appropriate for this task. However, a major challenge in data-driven learning approaches for this problem has remained unexplored: the impact of discontinuities and corner interfaces in the underlying material. These discontinuities in the coefficients affect the smoothness of the solutions of the underlying equations. Given the prevalence of discontinuous materials in continuum mechanics applications, it is important to address the challenge of learning in this context; in particular, to develop underpinning theory that establishes the reliability of data-driven methods in this scientific domain. This paper addresses this unexplored challenge by investigating the learnability of homogenized constitutive laws for elliptic operators in the presence of such complexities. Approximation theory is presented, and numerical experiments are performed which validate the theory for the solution operator defined by the cell problem arising in homogenization for elliptic PDEs.
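For reference, the cell problem mentioned above can be written in the standard form from homogenization theory (generic notation, not necessarily the paper's):

```latex
% Cell problem on the unit torus: for each basis direction e_i, find the
% periodic corrector \chi_i solving
-\nabla_y \cdot \bigl( A(y)\,(\nabla_y \chi_i + e_i) \bigr) = 0, \qquad y \in \mathbb{T}^d,
% and assemble the homogenized (effective) coefficient from the correctors:
\overline{A}\, e_i = \int_{\mathbb{T}^d} A(y)\,\bigl( \nabla_y \chi_i(y) + e_i \bigr)\, dy .
```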